In this analysis, I estimate the bias of each public pollster active in the last six congressional elections. My final estimates identify ‘Gallup Low-Turnout’ as the most conservative pollster and ‘AP/Ipsos’ as the most liberal. I likewise estimate the bias of the various sampling universes. Next, I use these biases to estimate the true level of support for Democrats over time in each cycle. Lastly, I regress the final estimate of support in each cycle against the share of seats Democrats won.
I have two primary sources of data: past polls and election results. The poll response that I use is the ‘generic Congressional ballot.’ Each pollster words it slightly differently (one reason we measure pollster bias), but all versions resemble: ‘If the elections for the U.S. House of Representatives were being held today, which party’s candidate would you vote for in your congressional district: the Democratic candidate or the Republican candidate?’ A named Congressional ballot question would account for incumbency effects and more closely mirror the choice voters face in the voting booth. However, since not all candidates are known for 2018 yet, the generic ballot is the only question currently being polled, so for comparability I use the same question for past elections.
The past polls were taken from Real Clear Politics’ database across six election cycles: 2006, 2008, 2010, 2012, 2014 and 2016. Only polls where the year, date range, pollster, sampling universe and sample size are all known were included. Additionally, each poll’s result was transformed into the two-way share for Democrats (Dem/(Dem+Rep)): a proportion between 0 and 1. Time is transformed into the rounded number of weeks between the middle day of the poll and election day. A daily model would be more precise, but would require more data.
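As a small illustration of the two transformations, here is a sketch in R with invented numbers (the actual transformation code is in Appendix B):

```r
# Hypothetical poll: D 46, R 42, fielded 10/24/16-10/26/16; election day 11/8/16
dem_n <- 46
rep_n <- 42
twoway <- dem_n / (dem_n + rep_n)   # two-way Democratic share, ~0.523

start <- as.Date("2016-10-24")
end_d <- as.Date("2016-10-26")
elec  <- as.Date("2016-11-08")
# Rounded weeks between the middle day of the poll and election day
week <- round(as.numeric((elec - end_d) + (end_d - start) / 2) / 7)   # 2
```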
In total, 797 polls from 41 pollsters contacting 1.7m respondents over the six election cycles were used. The five largest pollsters are shown here; see Appendix B for full details.
For election results, I use both the popular vote share and the seats won. These were taken from Wikipedia: 2006, 2008, 2010, 2012, 2014, and 2016. Again, I use Democrats’ two-way vote share of the popular vote to mimic their two-way support in the polling data, and their percentage share of seats in the Congress.
First, let’s explore the trends over time in each cycle. Here, each point is a poll; its size reflects the sample size and its color represents the pollster. The dashed line represents Democrats’ final two-way popular vote share. A couple of observations are clear. Within each election, some pollsters are systematically off. For example, the pink pollster in 2010 was consistently below the final election result, suggesting bias. We also see trends in results over time. For example, in 2014 the polls got closer and closer to the true result as the election approached. Further investigation shows that poll results are not normally distributed around the result across time, suggesting we will need a time-dependent model.
To estimate bias for each pollster and universe, I use a Bayesian random-walk model anchored to the true final election results. For the first cycle a pollster/universe appears in, its prior is normally distributed around 0, with bias assumed to be less than 20pp in either direction 95% of the time. In each subsequent cycle, the prior is updated to the posterior from the most recent previous cycle in which the pollster/universe was active. The full specification of the theoretical model can be found in Appendix A; implementation details and key convergence diagnostics can be found in Appendix B.
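That first-cycle prior corresponds to a normal with mean 0 and standard deviation 0.2/1.96 on the two-way share; a quick sanity check in R:

```r
# First-cycle prior on pollster/universe bias: 95% of mass within +/-20pp (0.2)
prior_sd <- 0.2 / 1.96
qnorm(0.975, mean = 0, sd = prior_sd)  # ~0.2, i.e. 20pp
```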
Below I plot the final bias estimate for each pollster. For example, for a pollster who polled in 2014 but not 2016, this is their 2014 posterior. Most pollsters are not biased by more than a percentage point in either direction. ‘Gallup Low-Turnout’ was the most conservative (4 polls in 1 election cycle). ‘AP/Ipsos’ most consistently overestimated Democratic support (4 polls in 2 election cycles). POS (R) was the least biased pollster, with an average bias of -0.00025 across their 3 polls in 1 cycle. Full results can be found in Appendix B.
Looking more closely at pollsters active in at least 5 of the 6 cycles examined, we see some variation in bias across cycles. For example, CBS/NYT strongly overestimated Democratic support in 2008, but became less and less biased each cycle. Others were too conservative in some cycles and too liberal in others. Fox News underestimated Democratic support in all of them.
Additionally, most sampling universes also overestimate Democratic support. The posterior estimates from the 2016 cycle show that likely voter universes across pollsters were biased 0.9pp in favor of Democrats, registered voter universes 1.3pp, and samples of all adults nearly 4pp. Full results can be found in Appendix B.
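To make these universe corrections concrete, here is a sketch of how the 2016 estimates would de-bias a single poll (the raw poll number is invented):

```r
# 2016 posterior universe biases on the two-way share (positive = pro-Democratic)
univ_bias <- c(LV = 0.009, RV = 0.013, Adults = 0.039)

# A hypothetical adults-universe poll showing Democrats at 52% two-way:
raw <- 0.52
adjusted <- raw - univ_bias[["Adults"]]  # 0.481, i.e. ~3.9pp lower
```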
These trends were fairly stable over time. The rank order of the universes was the same for every election except 2006. Both the adult and registered voter universes have hovered around their final estimates since the 2010 cycle. In 2010, likely voter universes showed essentially no bias, but their bias grew over the following three elections.
Using the final estimates of bias for pollsters and universes as priors, I now refit the random-walk models, but with no anchor to the true result. This allows us to generate estimates week-by-week for each election, including a final estimate of the election outcome, simulating a future prediction. The results are slightly overfit, especially for 2016, because the true results of each election informed the priors that are now inputs to the model. For 2016 specifically, the priors are derived from the posterior distribution of the model anchored to the true result, so we should expect the model to be very precise. For the full model specification see Appendix A; for implementation, code and full results see Appendix B.
The figure below highlights a few key trends. First, the estimated ‘true’ trendline for each election cycle sits slightly below the polls. This is because our estimates of universe bias (above) consistently overestimated Democratic support, while pollster house effects are split between over- and under-estimating it.
Second, while there appears to be significant variation over time in the polls, the trendlines are much smoother. In fact, week-over-week, the models estimate 95% of movement is less than 1.5pp. The model used here has one parameter for week-to-week movement, so it averages over big swings (see weeks 63-50 in 2014) and small, incremental changes (see weeks 100-63 in 2014). A model that separates out these trends might better identify which swings are real, substantive movements, and which should be smoothed.
Cycle | Forecast | Popular Vote | Seat Share | Forecast Error |
2006 | 57.2% | 54.1% | 53.6% | -3.1pp |
2008 | 55.1% | 55.5% | 59.1% | 0.5pp |
2010 | 45.7% | 46.5% | 44.4% | 0.8pp |
2012 | 50.3% | 50.6% | 46.2% | 0.3pp |
2014 | 48.3% | 47.1% | 43.2% | -1.2pp |
2016 | 50.9% | 49.4% | 44.6% | -1.5pp |
Third, some cycles’ models are more accurate than others. This table shows the percentage point error of each final forecast. In 2008, 2010 and 2012, my models were very accurate. In 2014 and 2016, they missed by about a point, and in 2006, the forecast missed by 3pp. The error is not consistently biased in one direction, but instead sometimes overestimates and sometimes underestimates Democratic support.
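The error column can be reproduced directly from the forecast and popular-vote columns (the 2008 entry comes out as 0.4 here rather than 0.5 because the table was computed from unrounded values):

```r
forecast <- c(57.2, 55.1, 45.7, 50.3, 48.3, 50.9)  # final forecasts, by cycle
actual   <- c(54.1, 55.5, 46.5, 50.6, 47.1, 49.4)  # two-way popular vote

err <- actual - forecast
mae <- mean(abs(err))                             # about 1.2pp
winners <- sum((forecast > 50) == (actual > 50))  # cycles where the winner was called
```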
The model that adjusts for both past pollster and sampling universe bias is more accurate than a model that accounts for pollster bias alone or for no bias at all. Over the six cycles analyzed here, it reduces error by just over 0.3pp relative to the no-bias model and by just under 0.3pp relative to the pollster-bias-only model.
Lastly, we are interested in the relationship between the final forecasts and the actual share of seats won. Because of our winner-take-all and gerrymandered system, the popular vote rarely translates directly into the share of seats won. A simple linear regression shows that 69% of the variation in seat share over the last six elections is explained by the forecasts produced here. Additionally, Democrats would need to be forecasted to win about 52% of the popular vote to win 50% of the seats.
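Both figures can be reproduced from the forecast table above (shares as percentages):

```r
forecast <- c(57.2, 55.1, 45.7, 50.3, 48.3, 50.9)  # final forecasts, by cycle
seats    <- c(53.6, 59.1, 44.4, 46.2, 43.2, 44.6)  # Democratic seat share

fit <- lm(seats ~ forecast)
summary(fit)$r.squared                      # about 0.69

# Forecast share at which the fitted line crosses 50% of seats:
unname((50 - coef(fit)[1]) / coef(fit)[2])  # about 52
```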
Using past public polling data and election results, I generated bias estimates for each pollster and sampling universe actively polling the generic Congressional ballot over the past six cycles. Generally, pollsters are consistently biased by less than 1pp, but several still exceed this threshold. Of the pollsters analyzed, I find POS (R) and NBC/WSJ are the least biased. Likewise, I find that likely voter sampling universes are less biased than registered voter or adult universes, though all three tend to overestimate the true level of Democratic support.
I then use these (admittedly overfit) estimates of bias to generate forecasts for the six cycles analyzed. The final forecasts had a mean absolute error of about 1.2pp, and five of six correctly predicted the winner. These forecasts are fairly predictive of the final seat share Democrats would win.
For a formal model, I follow Jackman (2005) in specifying a model to estimate biases, with an added term for sampling universe. A given poll is assumed to be normally distributed, with the true level of support as its mean and a variance that is a function of \(y_i\) and the sample size. This is specified as: \[y_i \sim \mathcal{N}(\mu_i, \sigma^2_i)\] That poll is centered around mean \(\mu_i\), which is itself a function of \(\alpha_t\), the true value of support at the time \(t\) the poll was taken, \(\delta_j\), the bias of pollster \(j\), and \(\theta_k\), the bias of sampling universe \(k\). Fully specified, this is: \[\mu_i = \alpha_{t_i} + \delta_{j_i} + \theta_{k_i}\] Given the trends we saw in the initial data exploration, a random-walk model is appropriate. In such a model, support at time \(t\) is normally distributed around support at time \(t - 1\): \[ \alpha_t \sim \mathcal{N}(\alpha_{t-1}, \omega^2) \] By anchoring the model in the final election results, and by using a random walk, I can estimate the consistent bias, \(\delta\), of each pollster and the effect, \(\theta\), of each sampling universe.
For these given specifications, we start with the following priors: \[ \sigma^2_i = \frac{y_i(1-y_i)}{n_i},\ \ \ \alpha_1 \sim \mathcal{U}(0.46, 0.56),\ \ \ \omega \sim \mathcal{U}(0, (0.02/1.96))\] \(\sigma^2_i\) simply follows the formula for the variance of a sample proportion. As a prior for the starting true value of support (\(\alpha_1\)), I use a uniform distribution over the minimum and maximum actual vote shares of Democrats in the six elections analyzed. Lastly, as a prior for the scale of week-to-week movement in support (\(\omega\)), I use a uniform distribution between 0 and 0.02/1.96 (about 0.01). The upper end of that range would mean that 95% of week-to-week movement is within about 2pp in either direction, a fairly weak assumption. These priors are similar to Strauss (2007).
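These two priors can be sanity-checked with a quick sketch (the poll numbers are hypothetical):

```r
# Sampling variance for a hypothetical poll: 52% two-way support, n = 1000
y <- 0.52
n <- 1000
sigma2 <- y * (1 - y) / n   # 0.0002496
sqrt(sigma2)                # ~0.016, i.e. a ~1.6pp standard error

# At the top of omega's prior range, 95% of weekly moves stay within 2pp:
1.96 * (0.02 / 1.96)        # 0.02
```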
\[\ \ \ \delta_j \sim \mathcal{N}(0, (0.2/1.96)^2),\ \ \ \theta_k \sim \mathcal{N}(0, (0.2/1.96)^2)\]
For pollster biases (\(\delta_j\)), I start with a prior of no bias, with a standard deviation reflecting that bias falls within 20pp in either direction 95% of the time; the prior for sampling universe bias (\(\theta_k\)) is the same. However, these priors are updated after each election cycle. The prior above therefore applies only to the first cycle a pollster was active in (often 2006). Subsequently, the prior takes its mean and variance from the posterior draws of \(\delta_j\) in the most recent cycle the pollster was active.
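Putting the pieces together, the anchored model can be sketched in JAGS syntax (the sampler used here via rjags). This is a minimal sketch: the data names (y, week, pollster, univ and the prior vectors) are hypothetical, and the actual implementation lives in forecasting_functions.R. Note that JAGS parameterizes the normal by precision (1/variance).

```r
model_string <- "
model {
  for (i in 1:n_polls) {
    # Each poll is normal around true support plus pollster and universe bias
    y[i] ~ dnorm(alpha[week[i]] + delta[pollster[i]] + theta[univ[i]], 1 / sigma2[i])
  }
  alpha[1] ~ dunif(0.46, 0.56)        # earliest week of the cycle
  for (t in 2:n_weeks) {
    alpha[t] ~ dnorm(alpha[t - 1], pow(omega, -2))   # random walk on support
  }
  omega ~ dunif(0, 0.02 / 1.96)
  for (j in 1:n_pollsters) {
    delta[j] ~ dnorm(delta_mu[j], 1 / delta_sigma2[j])
  }
  for (k in 1:n_univ) {
    theta[k] ~ dnorm(theta_mu[k], 1 / theta_sigma2[k])
  }
}
"
```

In the anchored fit, the week corresponding to election day would instead be pinned to the true two-way result; in the forecast fit it is left free, as described above.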
To fit the final forecasts, without anchoring in the election result, I use the final posterior distributions for each pollster and universe as priors. However, there needs to be some baseline to have a fully specified model. Whereas previously I used the election result as the baseline, in the forecasts, I use the pollster/universe with the smallest variance of the estimate. That is, I take as ground-truth our most certain estimate of bias and calculate the bias of other pollsters/universes relative to that.
library(ggplot2)
library(tidyverse)
library(rjags)
library(cowplot)
library(flextable)
source("forecasting_functions.R")
set.seed(102)
options(scipen = 999)
pollster_lkup <- read.csv("data/pollster_lkup.csv")
res <- read.csv("data/election_results.csv") %>%
mutate(twoway_vote = dem_vote/(dem_vote+rep_vote),
twoway_seat = dem_seats/(dem_seats+rep_seats)) %>%
arrange(cycle)
polls <- read.csv("data/past_polls.csv") %>%
mutate(twoway = dem/(dem+rep)) %>%
inner_join(res[,c("cycle","date")], by="cycle") %>%
mutate(week = round(as.numeric((as.Date(as.character(date), format="%m/%d/%y") -
as.Date(as.character(end_date), format="%m/%d/%y")) +
(as.Date(as.character(end_date), format="%m/%d/%y") -
as.Date(as.character(start_date), format="%m/%d/%y"))/2)/7),
n_size = as.numeric(as.character(n_size)))
polling_summary <- polls %>%
group_by(pollster) %>%
summarise(`Total N-Size` = sum(n_size),
`# of Polls` = n(),
`# of Cycles` = length(unique(cycle))) %>%
arrange(desc(`Total N-Size`)) %>%
inner_join(pollster_lkup, by = "pollster") %>%
mutate(pollster_raw = factor(pollster_raw, levels = pollster_raw[order(`Total N-Size`)]))
polling_summary_ft <- polling_summary %>%
mutate(nsize = as.character(`Total N-Size`),
polls = `# of Polls`,
cycles = `# of Cycles`) %>%
select(pollster_raw, nsize, polls, cycles)
FT2 <- flextable(polling_summary_ft)
FT2 <- set_header_labels(FT2, pollster_raw = "Pollster", nsize = "Total N-Size", polls = "# of Polls", cycles = "# of Cycles")
FT2 <- theme_zebra(x = FT2, odd_header = "#CFCFCF", odd_body = "#EFEFEF",
even_header = "transparent", even_body = "transparent")
FT2 <- align(x = FT2, j = 1, align = "left", part = "all")
FT2 <- align(x = FT2, j = 2:4, align = "center", part = "all")
FT2 <- bold(x = FT2, bold = TRUE, part = "header")
FT2
Pollster | Total N-Size | # of Polls | # of Cycles |
Rasmussen | 1140483 | 318 | 6 |
Quinnipiac | 61471 | 30 | 5 |
Gallup | 56170 | 35 | 5 |
Democracy Corps (D) | 40457 | 42 | 5 |
Fox News | 40139 | 43 | 5 |
Pew | 29423 | 20 | 5 |
Reuters/Ipsos | 28448 | 26 | 3 |
PPP (D) | 28410 | 32 | 4 |
CNN/ORC | 26294 | 33 | 6 |
NBC/WSJ | 21483 | 23 | 4 |
Politico/GWU/Battleground | 18000 | 18 | 3 |
USA Today/Gallup | 16739 | 17 | 4 |
GWU/Battleground | 13800 | 14 | 4 |
Economist/YouGov | 12686 | 10 | 1 |
CBS/NYT | 12668 | 14 | 5 |
Bloomberg | 10153 | 12 | 4 |
McClatchy/Marist | 9376 | 11 | 4 |
ABC/WaPo | 9226 | 11 | 5 |
Newsweek | 8796 | 11 | 4 |
Gallup High-Turnout | 7724 | 4 | 1 |
Gallup Low-Turnout | 7724 | 4 | 1 |
Diageo/Hotline | 5690 | 7 | 2 |
Time | 5646 | 6 | 3 |
AP/GFK | 5536 | 6 | 4 |
Ipsos/McClatchy | 5504 | 6 | 1 |
NPR | 5090 | 6 | 3 |
Battleground | 5018 | 5 | 3 |
AP/Ipsos | 4045 | 4 | 2 |
USA Today/Pew | 3912 | 3 | 1 |
Cook/RT Strategies | 3302 | 4 | 1 |
LAT/Bloomberg | 3215 | 3 | 2 |
Resurgent Republic (R) | 3000 | 3 | 1 |
POS (R) | 2500 | 3 | 1 |
National Journal/FD | 2316 | 2 | 1 |
Zogby | 2013 | 2 | 1 |
McLaughlin (R) | 2000 | 2 | 1 |
Hotline/FD | 1428 | 3 | 1 |
Reason | 1003 | 1 | 1 |
Harris | 1001 | 1 | 1 |
Winston (R) | 1000 | 1 | 1 |
USA Today/PSRAI | 697 | 1 | 1 |
two_sigma = 0.2
sigma2 = (two_sigma/1.96)^2
deltas <- data.frame(delta_cycle = 0,
delta_pollster = unique(polls$pollster),
delta_mu = rep(0, length(unique(polls$pollster))),
delta_sigma2 = rep(sigma2, length(unique(polls$pollster))))
deltas_all <- deltas
thetas <- data.frame(theta_cycle = 0,
theta_univ = unique(polls$univ),
theta_mu = rep(0, length(unique(polls$univ))),
theta_sigma2 = rep(sigma2, length(unique(polls$univ))))
thetas_all <- thetas
convergence <- list()
# Estimation: fit the anchored model for each cycle, carrying forward updated priors
for(cycle in res$cycle) {
data_jags <- data_prep(data = polls, res = res, year = cycle, anchor = T)
data_jags <- bias_priors(data_jags = data_jags, deltas = deltas, thetas = thetas, anchor = T)
convergence[[paste(cycle)]] <- convergence_diagnostics(data_jags = data_jags,
anchor = T,
chains = 4,
thining = 10,
burnin = 10000,
iter = 1000000)
mod_res <- run_model(data_jags = data_jags,
anchor = T,
chains = 4,
thining = 10,
burnin = 10000,
iter = 1000000,
params = c("delta", "theta"))
prior_ests <- calculate_priors(mod_res = mod_res, year = cycle, data_jags = data_jags, anchor = T)
new_priors <- update_priors(deltas_all = deltas_all, thetas_all = thetas_all,
deltas_new = prior_ests$deltas_est, thetas_new = prior_ests$thetas_est)
deltas <- new_priors$deltas
deltas_all <- new_priors$deltas_all
thetas <- new_priors$thetas
thetas_all <- new_priors$thetas_all
}
deltas <- deltas %>%
arrange(delta_mu) %>%
inner_join(pollster_lkup, by = c("delta_pollster" = "pollster")) %>%
mutate(pollster_raw = factor(pollster_raw, levels = pollster_raw[order(delta_mu)]))
deltas_all <- deltas_all %>%
inner_join(pollster_lkup, by = c("delta_pollster" = "pollster"))
thetas <- thetas %>%
arrange(theta_mu) %>%
mutate(theta_univ = factor(theta_univ, levels = theta_univ[order(theta_mu)]))
## Convergence diagnostics for a sample of 2006 parameters:
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## delta[10] 1 1
## theta[3] 1 1
## xi[9] 1 1
##
## Multivariate psrf
##
## 1
## delta[10] theta[3] xi[9]
## Lag 0 1.00000000 1.000000000 1.0000000000
## Lag 10 0.54614410 0.842414407 0.6479339022
## Lag 50 0.35746885 0.570889504 0.2770018344
## Lag 100 0.21266196 0.346365247 0.0977201984
## Lag 500 0.00476482 0.008362083 0.0003705629
## Convergence diagnostics for a sample of 2008 parameters:
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## delta[2] 1 1
## theta[2] 1 1
## xi[16] 1 1
##
## Multivariate psrf
##
## 1
## delta[2] theta[2] xi[16]
## Lag 0 1.0000000000 1.000000000 1.000000000
## Lag 10 0.2008663658 0.810591275 0.546339430
## Lag 50 0.0523200172 0.352206104 0.281306715
## Lag 100 0.0051833456 0.126195478 0.118530308
## Lag 500 -0.0002175812 0.002406398 0.003007152
## Convergence diagnostics for a sample of 2010 parameters:
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## delta[10] 1 1
## theta[3] 1 1
## xi[96] 1 1
##
## Multivariate psrf
##
## 1
## delta[10] theta[3] xi[96]
## Lag 0 1.000000000 1.00000000 1.000000000
## Lag 10 0.191602780 0.70567278 0.497069151
## Lag 50 0.047676821 0.45326537 0.356198091
## Lag 100 0.010272039 0.29748969 0.240206424
## Lag 500 -0.002468651 0.01169564 0.006316747
## Convergence diagnostics for a sample of 2012 parameters:
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## delta[15] 1 1
## theta[3] 1 1
## xi[45] 1 1
##
## Multivariate psrf
##
## 1
## delta[15] theta[3] xi[45]
## Lag 0 1.0000000000 1.0000000000 1.000000000
## Lag 10 0.0233816404 0.3486880276 0.266722290
## Lag 50 0.0081948753 0.1401180564 0.106308451
## Lag 100 0.0048805013 0.0415554326 0.035292227
## Lag 500 -0.0002641986 -0.0001903756 0.001612574
## Convergence diagnostics for a sample of 2014 parameters:
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## delta[9] 1 1
## theta[2] 1 1
## xi[86] 1 1
##
## Multivariate psrf
##
## 1
## delta[9] theta[2] xi[86]
## Lag 0 1.0000000000 1.000000000 1.000000e+00
## Lag 10 0.0114703204 0.551802331 2.188369e-01
## Lag 50 0.0006921544 0.119901657 5.559910e-02
## Lag 100 -0.0008374459 0.021559699 8.840482e-03
## Lag 500 -0.0001438989 -0.001279462 -7.920388e-05
## Convergence diagnostics for a sample of 2016 parameters:
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## delta[10] 1 1
## theta[2] 1 1
## xi[39] 1 1
##
## Multivariate psrf
##
## 1
## delta[10] theta[2] xi[39]
## Lag 0 1.000000e+00 1.000000000 1.0000000000
## Lag 10 4.359223e-02 0.099004583 0.1572074203
## Lag 50 3.544292e-03 0.010580444 0.0134616524
## Lag 100 5.696531e-04 0.003168989 0.0008130858
## Lag 500 -8.839755e-05 -0.001716413 -0.0017817737
## Final estimates of pollster bias:
Pollster | Cycle | Bias | Variance |
Gallup Low-Turnout | 2010 | -0.060 | 0.000 |
Gallup High-Turnout | 2010 | -0.033 | 0.000 |
Resurgent Republic (R) | 2012 | -0.030 | 0.000 |
Fox News | 2016 | -0.017 | 0.000 |
NPR | 2014 | -0.015 | 0.000 |
Reason | 2014 | -0.015 | 0.000 |
USA Today/Gallup | 2012 | -0.013 | 0.000 |
Reuters/Ipsos | 2016 | -0.013 | 0.000 |
McLaughlin (R) | 2010 | -0.012 | 0.000 |
Gallup | 2014 | -0.012 | 0.000 |
Winston (R) | 2010 | -0.011 | 0.000 |
CNN/ORC | 2016 | -0.011 | 0.000 |
AP/GFK | 2016 | -0.010 | 0.000 |
Bloomberg | 2016 | -0.009 | 0.000 |
ABC/WaPo | 2016 | -0.008 | 0.000 |
Politico/GWU/Battleground | 2014 | -0.007 | 0.000 |
LAT/Bloomberg | 2008 | -0.006 | 0.000 |
Quinnipiac | 2016 | -0.005 | 0.000 |
GWU/Battleground | 2016 | -0.004 | 0.000 |
Zogby | 2006 | -0.004 | 0.001 |
Pew | 2014 | -0.003 | 0.000 |
McClatchy/Marist | 2016 | -0.003 | 0.000 |
Time | 2010 | -0.003 | 0.000 |
Democracy Corps (D) | 2014 | -0.003 | 0.000 |
PPP (D) | 2016 | -0.002 | 0.000 |
Battleground | 2010 | -0.002 | 0.000 |
Rasmussen | 2016 | -0.002 | 0.000 |
NBC/WSJ | 2016 | -0.001 | 0.000 |
POS (R) | 2010 | -0.000 | 0.000 |
Hotline/FD | 2006 | 0.004 | 0.001 |
CBS/NYT | 2016 | 0.004 | 0.000 |
Ipsos/McClatchy | 2010 | 0.004 | 0.000 |
Economist/YouGov | 2016 | 0.007 | 0.000 |
Harris | 2006 | 0.007 | 0.001 |
Diageo/Hotline | 2010 | 0.009 | 0.000 |
USA Today/Pew | 2014 | 0.010 | 0.000 |
National Journal/FD | 2010 | 0.011 | 0.000 |
Newsweek | 2012 | 0.014 | 0.000 |
USA Today/PSRAI | 2014 | 0.017 | 0.000 |
Cook/RT Strategies | 2006 | 0.022 | 0.001 |
AP/Ipsos | 2008 | 0.033 | 0.000 |
## Estimate for each pollster and cycle:
## (If a pollster is missing from a cycle, it did not poll).
Pollster | Cycle | Bias | Variance |
ABC/WaPo | 2006 | -0.008 | 0.001 |
AP/Ipsos | 2006 | 0.018 | 0.001 |
Battleground | 2006 | -0.016 | 0.001 |
CBS/NYT | 2006 | 0.034 | 0.001 |
CNN/ORC | 2006 | 0.007 | 0.001 |
Cook/RT Strategies | 2006 | 0.022 | 0.001 |
Democracy Corps (D) | 2006 | -0.007 | 0.001 |
Fox News | 2006 | -0.003 | 0.001 |
Gallup | 2006 | 0.013 | 0.001 |
Harris | 2006 | 0.007 | 0.001 |
Hotline/FD | 2006 | 0.004 | 0.001 |
LAT/Bloomberg | 2006 | 0.004 | 0.001 |
NBC/WSJ | 2006 | 0.005 | 0.001 |
Newsweek | 2006 | 0.014 | 0.001 |
Pew | 2006 | 0.006 | 0.001 |
Quinnipiac | 2006 | 0.002 | 0.001 |
Rasmussen | 2006 | 0.001 | 0.001 |
Time | 2006 | 0.014 | 0.001 |
USA Today/Gallup | 2006 | -0.024 | 0.001 |
Zogby | 2006 | -0.004 | 0.001 |
ABC/WaPo | 2008 | -0.013 | 0.000 |
AP/GFK | 2008 | 0.003 | 0.000 |
AP/Ipsos | 2008 | 0.033 | 0.000 |
Battleground | 2008 | -0.008 | 0.000 |
CBS/NYT | 2008 | 0.039 | 0.000 |
CNN/ORC | 2008 | -0.006 | 0.000 |
Democracy Corps (D) | 2008 | -0.016 | 0.000 |
Diageo/Hotline | 2008 | -0.024 | 0.000 |
Fox News | 2008 | -0.014 | 0.000 |
Gallup | 2008 | 0.009 | 0.000 |
GWU/Battleground | 2008 | -0.009 | 0.000 |
LAT/Bloomberg | 2008 | -0.006 | 0.000 |
NBC/WSJ | 2008 | 0.013 | 0.000 |
Newsweek | 2008 | 0.003 | 0.000 |
Pew | 2008 | 0.007 | 0.000 |
Rasmussen | 2008 | -0.001 | 0.000 |
Time | 2008 | 0.005 | 0.000 |
USA Today/Gallup | 2008 | -0.032 | 0.000 |
ABC/WaPo | 2010 | 0.007 | 0.000 |
AP/GFK | 2010 | -0.004 | 0.000 |
Battleground | 2010 | -0.002 | 0.000 |
Bloomberg | 2010 | 0.007 | 0.000 |
CNN/ORC | 2010 | -0.009 | 0.000 |
Democracy Corps (D) | 2010 | 0.001 | 0.000 |
Diageo/Hotline | 2010 | 0.009 | 0.000 |
Fox News | 2010 | -0.027 | 0.000 |
Gallup | 2010 | -0.011 | 0.000 |
Gallup High-Turnout | 2010 | -0.033 | 0.000 |
Gallup Low-Turnout | 2010 | -0.060 | 0.000 |
GWU/Battleground | 2010 | 0.004 | 0.000 |
Ipsos/McClatchy | 2010 | 0.004 | 0.000 |
McClatchy/Marist | 2010 | -0.006 | 0.000 |
McLaughlin (R) | 2010 | -0.012 | 0.000 |
National Journal/FD | 2010 | 0.011 | 0.000 |
Newsweek | 2010 | 0.016 | 0.000 |
NPR | 2010 | -0.014 | 0.000 |
Pew | 2010 | 0.002 | 0.000 |
Politico/GWU/Battleground | 2010 | 0.010 | 0.000 |
POS (R) | 2010 | -0.000 | 0.000 |
PPP (D) | 2010 | -0.005 | 0.000 |
Quinnipiac | 2010 | -0.005 | 0.000 |
Rasmussen | 2010 | -0.024 | 0.000 |
Reuters/Ipsos | 2010 | 0.001 | 0.000 |
Time | 2010 | -0.003 | 0.000 |
USA Today/Gallup | 2010 | -0.014 | 0.000 |
Winston (R) | 2010 | -0.011 | 0.000 |
Bloomberg | 2012 | -0.007 | 0.000 |
CBS/NYT | 2012 | 0.028 | 0.000 |
CNN/ORC | 2012 | -0.011 | 0.000 |
Democracy Corps (D) | 2012 | -0.001 | 0.000 |
Gallup | 2012 | -0.012 | 0.000 |
McClatchy/Marist | 2012 | -0.014 | 0.000 |
Newsweek | 2012 | 0.014 | 0.000 |
NPR | 2012 | -0.017 | 0.000 |
Pew | 2012 | -0.003 | 0.000 |
Politico/GWU/Battleground | 2012 | -0.005 | 0.000 |
PPP (D) | 2012 | 0.003 | 0.000 |
Quinnipiac | 2012 | -0.004 | 0.000 |
Rasmussen | 2012 | -0.009 | 0.000 |
Resurgent Republic (R) | 2012 | -0.030 | 0.000 |
Reuters/Ipsos | 2012 | -0.020 | 0.000 |
USA Today/Gallup | 2012 | -0.013 | 0.000 |
ABC/WaPo | 2014 | -0.004 | 0.000 |
AP/GFK | 2014 | -0.015 | 0.000 |
Bloomberg | 2014 | -0.008 | 0.000 |
CBS/NYT | 2014 | 0.008 | 0.000 |
CNN/ORC | 2014 | -0.010 | 0.000 |
Democracy Corps (D) | 2014 | -0.003 | 0.000 |
Fox News | 2014 | -0.020 | 0.000 |
Gallup | 2014 | -0.012 | 0.000 |
GWU/Battleground | 2014 | -0.007 | 0.000 |
McClatchy/Marist | 2014 | -0.007 | 0.000 |
NBC/WSJ | 2014 | 0.011 | 0.000 |
NPR | 2014 | -0.015 | 0.000 |
Pew | 2014 | -0.003 | 0.000 |
Politico/GWU/Battleground | 2014 | -0.007 | 0.000 |
PPP (D) | 2014 | -0.002 | 0.000 |
Quinnipiac | 2014 | -0.004 | 0.000 |
Rasmussen | 2014 | -0.001 | 0.000 |
Reason | 2014 | -0.015 | 0.000 |
USA Today/Pew | 2014 | 0.010 | 0.000 |
USA Today/PSRAI | 2014 | 0.017 | 0.000 |
ABC/WaPo | 2016 | -0.008 | 0.000 |
AP/GFK | 2016 | -0.010 | 0.000 |
Bloomberg | 2016 | -0.009 | 0.000 |
CBS/NYT | 2016 | 0.004 | 0.000 |
CNN/ORC | 2016 | -0.011 | 0.000 |
Economist/YouGov | 2016 | 0.007 | 0.000 |
Fox News | 2016 | -0.017 | 0.000 |
GWU/Battleground | 2016 | -0.004 | 0.000 |
McClatchy/Marist | 2016 | -0.003 | 0.000 |
NBC/WSJ | 2016 | -0.001 | 0.000 |
PPP (D) | 2016 | -0.002 | 0.000 |
Quinnipiac | 2016 | -0.005 | 0.000 |
Rasmussen | 2016 | -0.002 | 0.000 |
Reuters/Ipsos | 2016 | -0.013 | 0.000 |
## Final estimates of sampling universe bias:
Sampling Universe | Cycle | Bias | Variance |
LV | 2016 | 0.009 | 0 |
RV | 2016 | 0.013 | 0 |
Adults | 2016 | 0.039 | 0 |
## Estimate for each universe and cycle:
Sampling Universe | Cycle | Bias | Variance |
Adults | 2006 | 0.030 | 6e-04 |
LV | 2006 | 0.033 | 5e-04 |
RV | 2006 | 0.031 | 6e-04 |
Adults | 2008 | 0.026 | 4e-04 |
LV | 2008 | 0.009 | 1e-04 |
RV | 2008 | 0.015 | 1e-04 |
Adults | 2010 | 0.043 | 1e-04 |
LV | 2010 | -0.002 | 0 |
RV | 2010 | 0.026 | 0 |
Adults | 2012 | 0.037 | 1e-04 |
LV | 2012 | 0.004 | 0 |
RV | 2012 | 0.015 | 0 |
Adults | 2014 | 0.037 | 1e-04 |
LV | 2014 | 0.007 | 0 |
RV | 2014 | 0.015 | 0 |
Adults | 2016 | 0.039 | 0 |
LV | 2016 | 0.009 | 0 |
RV | 2016 | 0.013 | 0 |
all_cycle_est <- data.frame(iter_mean = numeric(0),
iter_sigma2 = numeric(0),
time_before_elec = numeric(0),
upper_bound = numeric(0),
lower_bound = numeric(0),
cycle = numeric(0))
omegas <- c()
# Forecasting: refit each cycle without anchoring, using final bias estimates as priors
for(cycle in res$cycle) {
data_jags <- data_prep(data = polls, res = res, year = cycle, anchor = F)
data_jags <- bias_priors(data_jags = data_jags, deltas = deltas, thetas = thetas, anchor = F)
mod_res <- run_model(data_jags = data_jags,
anchor = F,
params = c("xi", "omega"),
chains = 4,
thining = 10,
burnin = 10000,
iter = 1000000)
cycle_time_est <- extract_time_est(mod_res = mod_res, year = cycle, data_jags = data_jags)
all_cycle_est <- rbind(all_cycle_est, cycle_time_est)
omegas <- c(omegas, paste(extract_omega_est(mod_res = mod_res, year = cycle, data_jags = data_jags)))
}